Replication facility

A replication facility provides for the replication
of files or portions of files in a distributed
environment. The replication facility is able to
replicate any subtree within a distributed namespace
of the distributed environment. The replication
facility provides multi-mastered, weakly consistent
replication. The replication facility supports both
public replication and private replication.

Technical Field

The present invention relates generally to data 
processing systems and, more particularly, to repli- 
cation facilities used within distributed systems. 

Background of the Invention 

Replication facilities have been provided in a 
number of different types of software products. For 
instance, replication facilities have been incorpo- 
rated in database products, network directory ser- 
vice products, and groupware products. Many of 
the conventional replication facilities are limited in 
terms of what they can replicate. For instance, 
many conventional replicators can only replicate 
one type of logical structure (i.e., a file). Further- 
more, the conventional replicators are limited in 
terms of the quantity of the logical structures that 
may be replicated at a time. In particular, many 
conventional replicators can only replicate one file 
at a time. 

Summary of the Invention 

In accordance with a first aspect of a preferred 
embodiment of the present invention, a method is 
practiced in a distributed system having a replica- 
tion facility and a number of computer systems that 
each include a storage device. In this method, a 
plurality of files are provided and organized into a 
tree. A single one of the files is replicated using the 
replication facility such that a copy of the file is 
stored in the storage device of a different computer 
system than the original copy of the file. A subtree 
of files of multiple levels is also replicated. The 
subtree is originally stored on the storage device of 
one of the computer systems. Replication is performed
using the replication facility such that a copy of
the subtree and its files is stored in the storage
device of another of the computer systems.

In accordance with another aspect of the 
present invention, a first copy of a file is provided 
in one of the computer systems. A second copy of 
the file is provided in another of the computer 
systems. The first copy of the file is reconciled with 
the second copy of the file using a reconciler 
facility. The reconciliation ensures that the second 
copy of the file incorporates any changes made to 
the first copy of the file. A first copy of a group of 
files is provided in one of the computer systems, 
and a second copy of the group of files is provided 
in another of the computer systems. The reconciler 
facility is used to reconcile the first copy of the 
group of files with the second copy of the group of 
files so that the second copy of the group of files 
incorporates any changes made to the first copy of 
the group of files since last reconciled. 



In accordance with a further aspect of the
present invention, a first copy of a group of files is
stored in the storage device of a first of the
computer systems. A second copy of the group of files
is stored in the storage device of a second of the
computer systems. Changes are made to at least
one of the files in the first copy of the group of files.
The changes are propagated to the second copy of the
group of files upon the occurrence of an event.
Additional changes are made to at least one of the
files in the first copy of the group of files, and these
changes are also propagated to the second copy of the
group of files upon the occurrence of another event.

In accordance with yet another aspect of the
present invention, a first copy of a group of files is
stored in the storage device of a first computer
system. A second copy of the group of files is
stored in the storage device of a second computer
system. Any changes made to the first copy of the
group of files are incrementally sent to the second
computer system so that the changes may be
made to the second copy of the group of files.

In accordance with an additional aspect of the
present invention, a first set of files that are stored
in one of the storage devices is specified to be
replicated. A filter is specified for determining what
files in the first set of files are to be replicated. The
files specified by the filter are replicated using the
replication facility to produce a second set of files.

In accordance with a still further aspect of the
present invention, files having names are stored in
the storage devices of the computer systems of the
distributed system. A distributed namespace is
provided. The distributed namespace comprises a
logical organization of the names of the stored files.
Selected portions of a group of files in the
namespace are replicated to create new files
holding the selected portions of the files.

In accordance with a further aspect of the
present invention, a first copy of a set of files of a
given class is stored in a first computer system. A
second copy of the set of files is stored in a
second computer system. The first copy of the set
of files is reconciled with the second copy of the
set of files using a class-specific reconciler that
only reconciles files of the given class. The files
may be stored as persistent objects, which are
organized into classes. Objects and classes will be
discussed below.

In accordance with another aspect of the
present invention, an application program is run on
one of the computer systems of a distributed
system. A request is made within the application
program to a private replication mechanism to
replicate a set of files. Each of the files maintains a list
of processes that are permitted to access the file.
The set of files is replicated using the private
replication mechanism to produce a new set of files
without replicating the list of processes that are
permitted to access the file.

In accordance with a further aspect of the
present invention, a first copy of a group of files is
provided in a first computer system and a second
copy of the group of files is provided in a second
computer system. Changes are made to the first
copy of the group of files. An agent is provided for
the first copy of the group of files. The agent has
access rights to access and read the files in the
first copy of the group of files. A reconciler is
provided at the second computer system for
reconciling the second copy of the group of files with
the first copy of the group of files. A proxy is
granted from the agent of the first copy of the
group of files to the reconciler. The proxy grants
the reconciler limited authority to access and read
the files in the first copy of the group of files. The
reconciler then reconciles the second copy of the
group of files with the first copy of the group of
files so that changes that were made to the first
copy of the group of files are also made to the
second copy of the group of files.

In accordance with a final aspect of the present 
invention, a method is practiced in a distributed 
system. In this method, heterogeneous file systems 
are provided in the distributed system. A storage 
manager is provided for each file system to man- 
age access to the files held therein. In response to 
a request to reconcile a first set of files with a 
second set of files, access is granted to the first 
set of files by the storage manager for the file 
system that holds the first set of files and access is 
granted to the second set of files by the storage 
manager for the file system that holds the second 
set of files. The first set of files is reconciled with
the second set of files under the control of the
storage managers of the respective file systems
that hold the first set of files and the second set of
files.

Brief Description of the Drawings 

Figure 1A is a block diagram of a distributed 
system suitable for practicing a preferred embodi- 
ment of the present invention. 

Figure 1B is a diagram of a distributed 
namespace for a distributed system in accordance 
with the preferred embodiment of the present in- 
vention. 

Figure 2 is a block diagram of a change log 
used in the preferred embodiment of the present 
invention. 

Figure 3 is a block diagram of a replication 
information block (RIB) used in the preferred em- 
bodiment of the present invention. 

Figure 4 is a block diagram illustrating the
functional components of the replication facility
used in the preferred embodiment of the present
invention.

Figure 5 is a diagram illustrating the interaction
of elements that play a role in public replication in
the preferred embodiment of the present invention.

Figure 6 is a flowchart of the steps performed
in replication in the preferred embodiment of the
present invention.

Figure 7 is a flowchart illustrating the steps
performed to provide security during replication in
the preferred embodiment of the present invention.

Detailed Description of the Invention 

A preferred embodiment of the present invention
provides a replication facility for use in a
distributed environment. The replication facility
supports weakly consistent replication of any
subtree of persistent objects in the distributed
namespace of the system. The replication facility
may replicate single objects or may replicate
logical structures that include multiple objects. The
replication facility reconciles local copies of objects
with remote copies of objects. Reconciliation
occurs on a pair-wise basis such that each object in a
local set of objects is reconciled with its
corresponding object in the remote set of objects.
The reconciliation may occur over heterogeneous
file systems.

Figure 1A depicts a distributed system 10 that
is suitable for practicing the preferred embodiment
of the present invention. The distributed system 10
includes an interconnection mechanism 12, such as
a local area network (LAN), wide area network
(WAN), or other interconnection mechanism, that
interconnects a number of different data processing
resources. The data processing resources include
workstations 14, 16, 18 and 20, printers 22 and 24,
and secondary storage devices 26 and 28. Each of
the workstations 14, 16, 18 and 20 includes a
respective memory 30, 32, 34 and 36. Each of the
memories 30, 32, 34 and 36 holds a copy of a
distributed operating system 38. Each workstation
14, 16, 18 and 20 may implement a separate file
system.

Those skilled in the art will appreciate that the
present invention may be practiced on
configurations other than the configuration shown in Figure
1A. The distributed system 10 shown in Figure 1A
is intended to be merely illustrative and not limiting
of the present invention. For instance, the
interconnection mechanism 12 may interconnect a number
of networks together that are running separate
network operating systems.

The preferred embodiment of the present invention
allows users and system administrators to
replicate persistent "objects." An object, in this
context, is a logical structure that holds at least one
data field. Groups of objects with similar properties
and common semantics are organized into object 
classes. A number of different object classes may 
be defined for the distributed system 10. Although 
the preferred embodiment of the present invention 
employs objects, those skilled in the art will appre- 
ciate that the present invention is not limited to an 
object-oriented environment; rather, the present in- 
vention may also be practiced in non-object-ori- 
ented environments. The present invention is not 
limited to replication of objects; rather, it is more 
generalized to support the replication of logical 
structures, such as files or file directories. 

The operating system 38 includes a file system 
for storing the objects that are used in the pre- 
ferred embodiment of the present invention. The 
objects are organized into a distributed namespace 
19 (Figure 1B). The distributed namespace 19 is a 
logical tree-like structure formed from the object 
names 21 stored in the file system of the operating 
system 38. The distributed namespace 19 illus- 
trates the hierarchy among the named objects of 
the system 10 (Figure 1A). 

The replication facility of the preferred embodi- 
ment of the present invention provides not only for 
the duplication of objects so that objects may be 
distributed across the distributed system, but also 
provides for reconciliation of multiple copies of 
objects (i.e., multimaster replication). Reconciliation 
refers to reconciling an object with a changed 
object so that the object reflects the changes made 
to the changed object. For instance, suppose that a 
remote copy of an object has been changed and a 
local copy of the object has not yet been updated 
to reflect the changes. Each object not only has 
contents but also has a name and location within 
the distributed file system. Reconciliation involves 
reconciling the two copies of the object such that 
the local copy of the object is changed in a like 
fashion to how the remote copy of the object was 
changed. The term "replication," as used herein, 
refers to not only duplicating objects so that mul- 
tiple copies of the objects are distributed across 
the distributed system 10, but also refers to rec- 
onciliation of the copies of the objects. 

Before discussing the preferred embodiment of 
the present invention in more detail below, it is 
helpful to introduce a few key concepts that will be 
referenced below. An "object set" is a collection of 
objects that are grouped together for replication. An 
object set may include a single object or a sub-tree 
of objects. The object set is specified by the user 
or administrator who requests replication. A "rep- 
lica set," in contrast, is a collection of systems 
which each own a local copy of an object set, and 
a "replica" is a member of a replica set. 

To insulate the replication facility from the
underlying physical storage system (e.g., the type of
file system employed to store objects) and to
provide extensibility, the preferred embodiment of the
present invention adopts the abstraction of a
replicated object store (ReplStore). The ReplStore
abstraction allows the replication facility to be applied
across heterogeneous file systems. The ReplStore
presents a group of interfaces that must be
supported for an underlying physical storage system to
support replication facilities. In particular, only
those objects that reside in object stores that
support the ReplStore interfaces can be replicated. An
interface is a named group of logically related
functions. The interface specifies signatures (such
as parameters) for the group of related functions
provided by an interface. The interface does not
provide code for implementing the functions;
rather, the code for implementing the functions is
provided by objects or by other implementations.
Objects that provide the code for an instance of an
interface are said to "support" the interface. The
code provided by an object that supports an
interface must comply with the signature specified
within the interface. Thus, in the example described
above, the object store that stores the objects in
the object set must support the ReplStore
interfaces in order for the object set to be replicated.
Implementations of the ReplStore interfaces are
provided for each of the file systems within the
distributed system 10 in order to support
replication over each of the file systems.
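
As a rough illustration of the interface idea described above, the
following Python sketch models ReplStore as an abstract interface that
declares signatures but supplies no code, together with one in-memory
object store that "supports" it. The method names (get_object,
put_object) and the in-memory store are hypothetical; the text does not
enumerate the functions of the ReplStore interfaces.

```python
from abc import ABC, abstractmethod


class ReplStore(ABC):
    """Hypothetical ReplStore interface: signatures only, no code."""

    @abstractmethod
    def get_object(self, robid: str) -> bytes:
        """Return the contents of the object named by a serialized ROBID."""

    @abstractmethod
    def put_object(self, robid: str, contents: bytes) -> None:
        """Store or overwrite the object named by a serialized ROBID."""


class InMemoryReplStore(ReplStore):
    """Toy object store that supports the interface by supplying the code."""

    def __init__(self) -> None:
        self._objects = {}

    def get_object(self, robid: str) -> bytes:
        return self._objects[robid]

    def put_object(self, robid: str, contents: bytes) -> None:
        self._objects[robid] = contents


def can_replicate(store: object) -> bool:
    # Only object stores that support the ReplStore interface are eligible.
    return isinstance(store, ReplStore)
```

Under this sketch, one such implementation would be written per file
system, and the replication code would depend only on the abstract
interface.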

Each ReplStore provides a mechanism for
identifying replicated objects on the local volume.
This mechanism is the replicated object ID
(ROBID). The ROBID is an abstraction that
encapsulates the identity as well as other information
about an object that is being replicated. The
ReplStore supports routines for serializing and
deserializing ROBIDs. The ROBID of an object
provides a mechanism for performing numerous
operations. For instance, an object can be retrieved
from storage using information contained in the
ROBID. Further, a component name of an object
can be derived from its ROBID.
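
A minimal sketch of ROBID serialization, assuming a ROBID carries an
object identifier and a component name. The field names and the JSON
encoding are illustrative assumptions; the text does not specify the
ROBID layout or its serialized form.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Robid:
    """Hypothetical replicated object ID; both fields are assumptions."""
    object_id: str        # identity of the object on the local volume
    component_name: str   # name of the object within its parent


def serialize_robid(robid: Robid) -> bytes:
    # sort_keys keeps the encoding stable from one call to the next
    return json.dumps(asdict(robid), sort_keys=True).encode("utf-8")


def deserialize_robid(data: bytes) -> Robid:
    return Robid(**json.loads(data.decode("utf-8")))


def component_name_of(data: bytes) -> str:
    # The component name is derived from the ROBID itself.
    return deserialize_robid(data).component_name
```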

Each ReplStore maintains a replicated storage
change log 40 (Figure 2). The change log 40
includes a number of change items 42 that specify
changes that have been made to objects in the
object set. Each change item 42 includes a type
field 44, a serialized ROBID field 46 for the object
that is changed, a time field 48 indicating the time
that the change occurred (local time) and a
replication information block (RIB) field 50 holding a RIB
that is associated with the change. In the
embodiment described herein, there are five types of
changes that may be specified within the type field
44. These changes are deletion, creation,
modification, renaming, and moving. A deletion occurs
when an object is deleted. Creation occurs when
the object is created. A modification occurs when
the contents of the object are modified in some
way. A renaming occurs when the component
name of the object is modified and moving occurs
when the object is moved under a new parent in
the distributed namespace of the system.

A cursor 49 is maintained within the change log 
40 that acts as an index into the list of change 
items 42. The cursor 49 acts as a marker in the list 
of change items 42. In addition, a change log may 
include multiple cursors. The cursor 49 may take 
the form of a time stamp. The cursor 49 may, for 
example, identify the beginning of changes that 
have occurred after a point in time. 
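
The change log and its cursor can be sketched as follows. This is an
illustration only: the class names are hypothetical, the RIB is kept as
an opaque tuple, and the cursor is assumed to be a plain time stamp, which
the text gives as one possible form.

```python
from dataclasses import dataclass, field
from enum import Enum


class ChangeType(Enum):
    DELETION = "deletion"
    CREATION = "creation"
    MODIFICATION = "modification"
    RENAMING = "renaming"
    MOVING = "moving"


@dataclass(frozen=True)
class ChangeItem:
    change_type: ChangeType   # type field 44
    robid: bytes              # serialized ROBID field 46
    time: float               # time field 48 (local time of the change)
    rib: tuple                # RIB field 50; layout left opaque here


@dataclass
class ChangeLog:
    items: list = field(default_factory=list)

    def append(self, item: ChangeItem) -> None:
        self.items.append(item)

    def since(self, cursor: float) -> list:
        """Return the change items stamped after the cursor time stamp."""
        return [item for item in self.items if item.time > cursor]
```

Because a cursor here is just a value passed to since(), several
partners can each hold their own cursor into the same log.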

Every object in an object set that is being 
replicated is stamped with an RIB 51 (Figure 3). 
The RIB 51 has three fields: an originator field 53,
a change identifier field 55, and a propagator field
57. The originator field 53 specifies where the last
change to the object occurred. The change iden- 
tifier field 55, in contrast, identifies the last change 
to the object relative to the originator identified 
within the originator field 53. Lastly, the propagator 
field 57 specifies the identity of the party who sent 
the change to the local site. When an object is 
changed locally, the RIB 51 associated with the 
object is modified to reflect the local site as the 
originator and the propagator. The change identifier 
is stamped appropriately. 
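
The stamping of a local change can be sketched as below. The Rib and
Site names are hypothetical, and the use of a per-site counter for the
change identifier is an assumption; the text says only that the change
identifier is stamped appropriately.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class Rib:
    originator: str   # originator field 53: site of the last change
    change_id: int    # change identifier field 55, relative to originator
    propagator: str   # propagator field 57: site that sent the change here


class Site:
    def __init__(self, name: str) -> None:
        self.name = name
        # Assumed scheme: a monotonically increasing counter per site.
        self._next_change = itertools.count(1)

    def stamp_local_change(self) -> Rib:
        """A local change makes this site both originator and propagator."""
        return Rib(self.name, next(self._next_change), self.name)
```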

Replication is useful for the distributed system 
10 in that it provides load balancing and availabil- 
ity. Replication provides load balancing by having 
more than one copy of an object stored across the 
distributed system 10 to limit the load on any one 
copy of the object. Replication enhances availabil- 
ity by allowing multiple copies of important objects 
to be distributed across the system 10. The en- 
hanced availability increases the fault resilience of 
the system. Specifically, by having copies of im- 
portant objects distributed across the system 10, 
users are less affected by failures within the sys- 
tem that prevent or limit access to objects. The 
enhanced availability also enhances the perfor- 
mance of the system. 

The preferred embodiment of the present in- 
vention is embodied in a replication facility 54 
(Figure 4) that is part of the operating system 38. 
Nevertheless, those skilled in the art will appreciate 
that the replication facility of the present invention 
may also be implemented in other environments, 
including graphical user interfaces. As shown in
Figure 4, the replication facility 54 includes three
primary functional components: a copying
component 56, a reconciler component 58 and a control
component 60. The replication facility 54 uses the
copying component 56 for duplication. In addition,
the replication facility 54 reconciles copies of
object sets using the reconciler component 58 to
ensure that they are consistent with each other.
This reconciliation ensures a consistent view of the
objects across the distributed system 10.

One level of control exerted by the control
component 60 concerns how replication is invoked.
Replication may be invoked manually or
automatically. Manual invocation requires that an explicit
request to replicate be made by a user or other
party. The user or other party must specify the
object set and the destination for replication. The
destinations are not specified for each replication
cycle; rather, a replica connection is specified
initially. The replica connection identifies the two
replicas and the object set that are to be involved in
replication. In contrast, automatic invocation occurs
when replication is triggered by certain events 67
(see Figure 5) or by the passage of a certain
amount of time (which may be construed as a type
of event). Replication may be prescheduled to
occur at fixed time intervals. Another aspect of control
exerted by the control component 60 concerns who
may invoke replication. Replication may be invoked
by an appropriately privileged party.

The preferred embodiment of the present invention
provides two types of replication: public
replication and private replication. Public replication
refers to a process that may be performed only by
appropriately privileged parties to produce a
"public" copy of an object set. In public replication,
each of the copies of the object set that are
produced cooperates with the other copies to maintain
consistency. The nodes in the namespace that
store the public copies, in aggregate, form a public
replica set, and the members of the set keep state
information to maintain consistency among the
copies. Access restrictions on the objects are
preserved. Changes that occur in a public copy of an
object set are reconciled with other public copies.
Private replication refers to a process for
producing private copies of an object set. A private
copy may be created by any party, including a
non-administrator. Not all members of the
replicated sets keep state information to maintain
consistency among copies. Private replication will be
discussed in more detail below.

A number of elements play a role in the
replication process in the preferred embodiment of the
present invention. Figure 5 is a diagram illustrating
the elements that may play a major role in the
public replication process. Object replication agents
(ORAs) 62 and 64 are replicator objects that act as
agents on behalf of nodes in which object sets are
stored to provide automatic support for replication.
Each machine in the distributed system has its own
ORA. The ORAs 62 and 64 may act as remote
procedure call (RPC) servers that service requests
made on behalf of remote clients or may
alternatively be other types of reliable communication
mechanisms that serve a similar role. A separate
ORA 62 is provided for a local object and another
ORA 64 is provided for the corresponding remote
object in the public replication process. Local ORA
62 is responsible for loading a ReplStore DLL 66
and a ReplStore Manager DLL 65. The ReplStore
Manager 65 is responsible for regulating access to
the ReplStore 66. Clients call the ReplStore
Manager 65 to load the appropriate ReplStore 66 for a
given physical storage system. The ORAs 62 and
64 have a level of privilege that allows them to read
and write all objects that are being replicated from
a local object store. The ORAs 62 and 64 are
responsible for replying to requests to exchange
changes with other ORAs which maintain public
replicas.

A reconciler 68 also plays a role in the public
replication process. It acts as a counterpart to the
local ORA 62 to reconcile the local object set with
the corresponding remote object set. The reconciler 68
is called by the local ORA 62 and is responsible for
opening objects that are to be reconciled. Two
types of reconciler objects may be called by the
reconciler 68. Specifically, a class-specific
reconciler 70 may be called or a default (i.e.,
class-independent) reconciler 72 may be called. The
class-specific reconciler 70 reconciles objects that
have class-specific requirements on replication.
The class-specific reconciler 70 is applied to only a
class of objects. The class-independent reconciler
72 reconciles objects regardless of their class.
Multiple class-independent reconcilers may be
available in the system 10. For instance, each object
set may have its own class-independent reconciler.
Every replica set may be associated with its own
class-independent reconciler which is invoked
whenever a class-specific reconciler is unavailable.
Lastly, as mentioned above, events 67 may play a
role in triggering replication.
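
The choice between a class-specific reconciler and the default
reconciler can be sketched as a lookup with a fallback. The registry
class, the string class names, and the behavior of the default
reconciler (adopting the remote contents) are all assumptions made for
illustration.

```python
from typing import Callable, Dict

# A reconciler maps (local contents, remote contents) to new local contents.
Reconciler = Callable[[bytes, bytes], bytes]


def default_reconciler(local: bytes, remote: bytes) -> bytes:
    # Assumed class-independent behavior: adopt the remote contents.
    return remote


class ReconcilerRegistry:
    def __init__(self, default: Reconciler = default_reconciler) -> None:
        self._default = default
        self._by_class: Dict[str, Reconciler] = {}

    def register(self, object_class: str, reconciler: Reconciler) -> None:
        self._by_class[object_class] = reconciler

    def select(self, object_class: str) -> Reconciler:
        # Fall back to the class-independent reconciler whenever no
        # class-specific reconciler is available for this class.
        return self._by_class.get(object_class, self._default)
```

One registry per replica set would match the statement that every
replica set may have its own class-independent reconciler.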

Figure 6 is a flowchart of the steps performed
for replication in the preferred embodiment of the
present invention. Initially, access is gained to a
change log 40 (Figure 2) for a remote object set
(step 74 in Figure 6). In particular, when a local
object set is to be reconciled with a remote object
set, the local ORA 62 (Figure 5) contacts the
remote ORA 64 via a remote procedure call
mechanism. The local ORA 62 contacts the remote ORA
64 to gain access to the change log 40. A cursor
49 (Figure 2) is then created in the change log
(step 76 in Figure 6). Specifically, the local ORA 62
stores a time stamp indicating the time of the last
reconciliation between the object sets and then
passes this time stamp to the remote ORA 64 to
be used as a cursor 49. The remote ORA 64 then
passes this time stamp as a cursor into the remote
change log 40. The cursor identifies items in the
change log that have time stamps after the last
reconciliation and, thus, are of interest for this
replication cycle.

A list of change items is then obtained from
the remote change log utilizing the cursor to
identify the change items that are for changes that have
occurred after the last reconciliation. The remote
ORA 64 screens the RIBs 51 of each of the change
items 42 to ensure that the remote ORA does not
pass back to the local ORA 62 changes that
originated at the local ORA (i.e., the remote ORA
examines the originator field 53 of the RIBs) and
examines the RIBs to ensure that change items for
changes that were propagated from the local ORA
(i.e., the remote ORA examines the propagator field
57 of the RIBs) are not sent. The resulting change
items are passed back to the local ORA 62 where
they are stored persistently. The local ORA 62 then
uses the reconciler 68 to perform namespace
reconciliation (step 80) and content reconciliation (step
82) on the objects identified by the ROBIDs in the
change items. In particular, the reconciler 68
reconciles each object that has changed in the remote
object set with the corresponding object of the local
object set. Any changes that have been made to
the remote object are made to the corresponding
local object. Whether the class-specific reconciler
70 or the class-independent reconciler 72 is used
depends upon the source (i.e., remote copy of an
object). A class-specific reconciler 70 is used only
if the remote copy of the object requires such a
reconciler.
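
The screening performed by the remote ORA can be sketched as a filter
over the change log. The flattened ChangeItem record and the function
name changes_for are hypothetical; the sketch assumes each site is
identified by a string and that the cursor is a time stamp.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ChangeItem:
    robid: str
    time: float
    originator: str   # taken from the originator field of the RIB
    propagator: str   # taken from the propagator field of the RIB


def changes_for(requester: str, log: Iterable[ChangeItem],
                cursor: float) -> List[ChangeItem]:
    """Return items newer than the cursor that the requester has not
    already seen, i.e. neither originated nor propagated by it."""
    return [
        item for item in log
        if item.time > cursor
        and item.originator != requester
        and item.propagator != requester
    ]
```

Filtering on both fields keeps a change from being echoed back to the
site that made it or the site that forwarded it.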

Namespace reconciliation is performed (see
step 80 in Figure 6) for any change recorded in a
change item that is not strictly a content
modification or that is not associated with a system
property. Such changes include creations, deletions,
moves, and renames. Namespace reconciliation
occurs by comparing information obtainable by
ROBIDs of local objects relative to information
stored for corresponding remote objects. Many
different ways for resolving name resolution conflicts
may be used within the present invention. The
preferred embodiment of the present invention,
however, adopts two rules. A first rule used by the
preferred embodiment of the present invention to
resolve namespace conflicts is to select a last
modification over a previous modification. When an
object is moved or renamed at one site to have a first
name, and the same object is moved or renamed
at another site to have a different name, the last
occurring change is chosen so that the object
assumes the name associated with the last change.
A second rule is used to resolve namespace
collisions. A namespace collision occurs when two
different objects are created, moved, or renamed to
have the same name. The second rule specifies
that whichever object was created, moved, or
renamed first is the object that is given the name
at the local site.
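
The two namespace rules can be sketched as a pair of small functions.
The NameChange record is hypothetical, time stamps from different sites
are assumed to be comparable, and ties are broken in favor of the first
argument; the text does not address ties or what becomes of the object
that loses a collision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NameChange:
    object_id: str
    name: str
    time: float   # when the create, move, or rename occurred


def resolve_conflict(a: NameChange, b: NameChange) -> NameChange:
    """First rule: same object, different names; the later change wins."""
    if a.object_id != b.object_id:
        raise ValueError("a conflict concerns a single object")
    return a if a.time >= b.time else b


def resolve_collision(a: NameChange, b: NameChange) -> NameChange:
    """Second rule: different objects, same name; the earlier change
    keeps the name."""
    if a.object_id == b.object_id or a.name != b.name:
        raise ValueError("a collision needs two objects with one name")
    return a if a.time <= b.time else b
```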

Content reconciliation (see step 82 in Figure
6) involves reconciling contents of a local object
with a remote object so that the local object
includes the modifications made to the remote
object. By examining the changes in the change log,
the local objects may be changed to have the
same contents as the remote objects.

During replication, changes are propagated
from one replica to another. Replication is "one
way" in that the changes made to an initial copy of
an object set are made to a second copy of the
object set. There is no immediate reciprocal action
to copy the changes made to the second copy of
the object set to the first copy of the object set.
Nevertheless, such propagation to the first copy of
the object set may be performed. Given this
one-way nature of replication, each replica monitors
how up to date its local copy of an object set is
relative to each partner replica. To this end,
cursors are maintained into partner change
logs. At the completion of each exchange during
reconciliation, the two replicas exchange cursor
information.

Public replication poses a number of security
issues. In general, reconcilers must be able to
update objects in order to perform replication. The
class-independent reconciler is a trusted system
process, and, thus, does not pose a security risk.
Class-specific reconcilers, however, are not trusted
system processes, and thus pose a security threat.
To help alleviate this security dilemma, the
preferred embodiment of the present invention utilizes
"proxies".

A proxy is a delegation ticket that allows
worker processes or remote processes to perform
well-defined operations without having
extraordinary privileges. The proxy packages credentials of
the granting party and lends them to the parties
seeking access to remote objects. The party
seeking access may then step into the shoes of the
granting party and access the necessary objects.
These credentials may be encrypted. Figure 7 is a
flowchart of the steps performed to utilize a proxy
in the preferred embodiment of the present
invention. During the replication process, the remote
ORA 64 (Figure 5) gives a local reconciler 68 a
proxy (step 84 in Figure 7). As mentioned above,
this proxy includes the appropriate credentials and
access rights that are to be granted by the remote
ORA to the local reconciler. The reconciler 68 then
sends the credentials to the remote site (step 86 in
Figure 7). In other words, the reconciler 68
presents the proxy to the remote site. The remote
site then validates the credentials, and if the
credentials are valid, grants limited access to the
objects within the remote copy of the object set in
question (step 88). The reconciler 68 then gains 
access to the remote objects in the object set (step 



80). The local reconciler's range of access, how- 
ever, is limited to only that which is necessary to 
perform proper reconciliation. It should be appre- 
ciated that the present invention is not limited to 
exclusively using proxies. Any technique that 
grants secure access, such as making each ORA a 
member of a common access group that grants 
access rights, is permissible. 
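
The grant, present, and validate sequence of steps 84 through 88 can be sketched in Python as follows. This is an illustration under stated assumptions, not the mechanism of the preferred embodiment: the specification does not say how credentials are packaged or validated, so the sketch assumes an HMAC-signed ticket, and the names RemoteAgent, grant_proxy, and read are hypothetical.

```python
import hashlib
import hmac
import json


class RemoteAgent:
    """Hypothetical stand-in for the remote ORA that guards a copy of an object set."""

    def __init__(self, secret, objects):
        self._secret = secret      # key known only to the remote site (assumption)
        self._objects = objects    # object id -> contents

    def _sign(self, payload):
        return hmac.new(self._secret, payload, hashlib.sha256).hexdigest()

    def grant_proxy(self, grantee, object_ids):
        """Step 84: package limited credentials and hand them to a reconciler."""
        payload = json.dumps(
            {"grantee": grantee, "objects": sorted(object_ids), "rights": "read"},
            sort_keys=True,
        ).encode()
        return {"payload": payload, "signature": self._sign(payload)}

    def read(self, proxy, object_id):
        """Steps 86-88: validate the presented proxy, then grant limited access."""
        if not hmac.compare_digest(self._sign(proxy["payload"]), proxy["signature"]):
            raise PermissionError("invalid proxy")
        claims = json.loads(proxy["payload"])
        if object_id not in claims["objects"]:
            raise PermissionError("object is outside the proxy's scope")
        return self._objects[object_id]


agent = RemoteAgent(b"site-secret", {"doc1": "v2", "doc2": "v7", "payroll": "secret"})
proxy = agent.grant_proxy("reconciler-68", ["doc1", "doc2"])
value = agent.read(proxy, "doc1")   # allowed: doc1 is named in the proxy
```

A request for an object that the proxy does not name, or a proxy whose payload has been altered, raises PermissionError in this sketch, which mirrors the limited range of access described above.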

Most of the above discussion has focused on 
public replication. Private replication is similar to 
public replication but includes a number of differ- 
ences. In private replication, the source of changes 
does not maintain a record of what objects were 
duplicated or changed. There is no state informa- 
tion maintained at the source. The source is not 
responsible for advising that changes have oc- 
curred. Accordingly, the resources that are required 
for public replication are not required. These char- 
acteristics make private replication especially ap- 
propriate for instances where manual control of 
replication is desired, or instances wherein the cost 
of maintaining a public copy of an object set is not 
warranted. 
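
The stateless character of private replication can be sketched in Python as follows. The names Source, snapshot, and private_replicate are hypothetical, and the sketch assumes an object set can be modeled as an in-memory dictionary; it is an illustration only.

```python
import copy


class Source:
    """Hypothetical source of an object set; it keeps no change log and no partner list."""

    def __init__(self, objects):
        self.objects = objects

    def snapshot(self):
        # Hands out a copy and records nothing about who asked or what was copied.
        return copy.deepcopy(self.objects)


def private_replicate(source, destination):
    """Manually triggered, one-shot copy driven entirely by the requester."""
    destination.clear()
    destination.update(source.snapshot())


source = Source({"report": ["draft"], "notes": ["todo"]})
private_copy = {}
private_replicate(source, private_copy)
source.objects["report"].append("final")   # later change at the source
```

Since the source neither records the copy nor announces later changes, the private copy stays as it was until the requester calls private_replicate again.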

While the present invention has been de- 
scribed with reference to a preferred embodiment 
thereof, those skilled in the art will appreciate that 
various changes in form and detail may be 
made without departing from the scope of the 
present invention as defined in the appended 
claims. For example, the present invention need 
not be implemented in an object-oriented environ- 
ment and need not be practiced solely in a distrib- 
uted system configuration like that shown in Figure 
1A. Furthermore, communication mechanisms oth- 
er than RPC mechanisms may be used for remote 
interactions, and security mechanisms other than 
proxies may be employed. 

Claims 

1. In a distributed system having a replication 
facility and a number of computer systems that 
each include a storage device, a method com- 
prising the steps of: 

providing a plurality of files organized into 
a tree of files; 

replicating a single one of the files that is 
stored in the storage device of one of the 
computer systems using the replication facility 
so that a copy of the file is stored in the 
storage device of another of the computer sys- 
tems; and 

replicating a subtree of files of multiple 
levels, from the tree of files, that is stored in 
the storage device of one of the computer 
systems using the replication facility so that a 
copy of the subtree of files is stored in the 
storage device of another of the computer systems. 

2. The method of claim 1, further comprising the 
step of replicating the single file using the 
replication facility so that a copy of the file is 
stored in the storage device of an additional 
one of the computer systems. 

3. The method of claim 1, further comprising the 
step of replicating the subtree using the repli- 
cation facility so that a copy of the subtree is 
stored in the storage device of an additional 
one of the computer systems. 

4. The method of claim 3 wherein the subtree 
being replicated includes at least three levels 
of files. 

5. A distributed system comprising: 

a plurality of computer systems, each 
computer system including a storage device 
for storing files; 

a namespace manager for managing a 
namespace of the system as a tree structure 
of names of the files; and 

a replication facility for replicating any files 
comprising a subtree of the namespace. 

6. In a distributed system having a reconciler 
facility and a number of computer systems, a 
method comprising the steps of: 

providing a first copy of a file in one of the 
computer systems and a second copy of the 
file in another of the computer systems; 

reconciling the first copy of the file with 
the second copy of the file using the reconciler 
facility so that the second copy of the file 
incorporates any changes made to the first 
copy of the file since last reconciled; 

providing a first copy of a group of files in 
one of the computer systems and a second 
copy of the group of files in another of the 
computer systems; and 

reconciling the first copy of the group of 
files with the second copy of the group of files 
using the reconciler facility so that the second 
copy of the group of files incorporates any 
changes made to the first copy of the group of 
files since last reconciled. 

7. The method of claim 6 wherein the step of 
reconciling the first copy of the group of files 
with the second copy of the group of files 
further comprises the step of reconciling on a 
pair by pair basis each file in the first copy of 
the group of files with a corresponding file in 
the second copy of the group of files. 



8. In a distributed system having a replication 
facility and a number of computer systems, 
each including a storage device, a method 
comprising the steps of: 

providing a first copy of a group of files 
stored in the storage device of a first of the 
computer systems; 

providing a second copy of the group of 
files stored in the storage device of a second 
of the computer systems; 

making changes to at least one of the files 
in the first copy of the group of files; 

propagating the changes to the second 
copy of the group of files upon the occurrence 
of an event; 

making additional changes to at least one 
of the files in the first copy of the group of 
files; and 

propagating the additional changes to the 
second copy of the group of files upon the 
occurrence of another event. 

9. The method recited in claim 8 wherein the 
event is the elapsing of a predetermined time 
period. 

10. The method recited in claim 9 wherein the other 
event is also the elapsing of a predetermined 
time period. 


11. The method of claim 8 wherein the event is a 
request by the second computer system to 
receive the changes. 

12. The method of claim 11 wherein the other 
event is a request by the second computer 
system to receive the additional changes. 

13. The method recited in claim 8, further comprising 
the step of reconciling the second copy of 
the group of files with the first copy of the 
group of files so that the second copy of the 
group of files incorporates the changes made 
to the first copy of the group of files. 


14. The method recited in claim 13, further comprising 
the step of reconciling the second copy 
of the group of files with the first copy of the 
group of files so that the second copy of the 
group of files incorporates the additional 
changes made to the first copy of the group of 
files. 

15. In a distributed system having a replication 
facility and computer systems that each include 
a storage device, a method comprising 
the steps of: 

storing files, having names, in the storage 
devices of the computer systems; 

providing a distributed namespace com- 
prising a logical organization of the names of 
the stored files; and 

replicating selected portions of a group of 
files stored in the storage devices of one of the 
computer systems and whose names form a 
part of the distributed namespace using the 
replication facility to create new files holding 
the selected portions of the files. 

16. The method recited in claim 15, further com- 
prising the step of replicating the new files to 
distribute the new files across at least a portion 
of the computer systems of the distributed 
system. 

17. In a distributed system having a first computer 
system and a second computer system, a 
method comprising the steps of: 

providing a first copy of a set of files of a 
given class that are stored in the first computer 
system; 

providing a second copy of the set of files 
of the given class that are stored in the second 
computer system; 

reconciling the first copy of the set of files 
with the second copy of the set of files using a 
class-specific reconciler that only reconciles 
files of the given class. 

18. The method recited in claim 17, further com- 
prising the steps of: 

making changes to the first copy of the set 
of files; 

reconciling the first copy of the set of files 
with the second copy of the set of files using a 
class-independent reconciler that reconciles 
files regardless of class. 

19. In a distributed system having a private repli- 
cation mechanism and computer systems for 
running processes that each include a storage 
device, a method comprising the steps of: 

running an application program on one of 
the computer systems; 

making a request to the private replication 
mechanism to replicate a set of files within the 
application program, each of the files maintain- 
ing a list of processes that are permitted to 
access the file; and 

replicating the set of files using the private 
replication mechanism to produce a new set of 
files without replicating, for each file, the list of 
processes that are permitted to access the file. 

20. In a distributed system having a first computer 
system and a second computer system, a 
method comprising the steps of: 

providing a collection of files at the first 
computer system; 

in response to a request to replicate the 
collection of files to the second computer system, 
determining whether all or none of the 
files in the collection should be replicated; 

where it is determined that all of the files 
in the collection should be replicated, replicating 
all of the files in the collection so that a 
replica of the collection is provided at the 
second computer system; and 

where it is determined that none of the 
files in the collection should be replicated, replicating 
none of the files in the collection. 

21. In a distributed system having a first computer 
system and a second computer system, a 
method comprising the steps of: 

providing a first copy of a group of files in 
the first computer system; 

providing a second copy of the group of 
files in the second computer system; 

making changes to the first copy of the 
group of files; 

providing an agent for the first copy of the 
group of files, wherein each agent has access 
rights to access and read the files in the first 
copy of the group of files; 

providing a reconciler at the second computer 
system for reconciling the second copy 
of the group of files with the first copy of the 
group of files; 

granting a proxy to the reconciler from the 
agent of the first copy of the group of files, 
said proxy granting the reconciler limited authority 
to access and read the files in the first 
copy of the group of files; and 

reconciling the second copy of the group 
of files with the first copy of the group of files 
using the reconciler so that the changes made 
to the first copy of the group of files are made 
to the second copy of the group of files. 

22. In a distributed system, a method comprising: 

providing heterogeneous file systems in 
the distributed system; 

providing a storage manager for each file 
system to manage access to files in the file 
system; 

in response to a request to reconcile a first 
set of files with a second set of files, granting 
access to the first set of files by the storage 
manager for the file system that holds the first 
set of files and granting access to the second 
set of files by the storage manager for the file 
system that holds the second set of files; and 

reconciling the first object set with the 
second set of objects under control of the 
storage managers of the respective file systems 
holding the first set of files and the 
second set of files. 

23. The method of claim 22 wherein each copy of 
a file stored in the file systems is provided a 
storage-specific identifier by the storage man- 
ager. 

24. The method of claim 22 wherein each storage 
manager reports changes to the files in its file 
system. 

25. The method of claim 24 wherein the changes 
include deletions of files. 

26. The method of claim 24 wherein the changes 
include renaming of files. 


27. The method of claim 24 wherein the changes 
include moving of files in the distributed sys- 
tem. 

28. The method of claim 24 wherein the changes 
are reported to a change log and wherein the 
step of reconciling is performed using the 
change log. 

29. The method of claim 22 wherein each copy of 
a file is assigned a unique identifier and 
wherein the step of reconciling includes com- 
paring identifiers to determine which files are 
to be reconciled. 
